ABSTRACT

Use of item response theory (IRT), the delta plot 
method, and Mantel-Haenszel techniques to assess differential item 
functioning (DIF) across racial and gender groups associated with the 
Maryland Test of Citizenship Skills (MTCS) is described. The 
objective of this research was to determine the: effect of sample 
size on results from these three DIF techniques; degree of 
relationship among these DIF statistics; and degree to which they 
identify the same items as biased. The data for the study include 
item responses from one form of the 1989 edition of the MTCS. The
MTCS consists of 45 multiple-choice items that assess students'
knowledge and skills in 3 domains: constitutional government; 
politics and political behavior; and principles, rights, and 
responsibilities. The MTCS was administered to 50,000 ninth graders 
during January and February of 1988. The analyses were performed on 
representative samples of 1,000, 750, 500, and 200 first-time test 
takers. It is concluded that no MTCS items are functioning 
differentially in either black/white or male/female comparisons. 
Plots of item difficulty estimates for black/white and male/female 
comparisons show nearly perfect linear relationships in both groups. 
Agreement, as indicated by rank order correlations across DIF
techniques, is very high between Rasch and Delta Plot DIF indices for
all sample sizes in both black/white and male/female comparisons. In
terms of agreement regarding biased and unbiased items, agreement
with the three-parameter DIF index is highest for the Delta Plot and
Rasch techniques. A 30-item list of references, 19 data tables, and
30 figures are included. (TJH)

The study of test items that function differently for
subpopulations of examinees is of concern to test 
developers. This concern is especially critical in 
competency-based testing, where graduation certification is 
contingent on passing one or more tests. Differential item 
functioning (DIF) was originally called item bias. Many 
researchers have attempted to define it clearly. In 
educational measurement, the term bias is used in reference
to tests and their use, usually for selection and
classification, or to individual items and their effect on
total test scores. Test items may be considered biased when
a minority group scores disproportionately lower than a
reference group due to factors other than ability. Cleary
and Hilton (1968) defined item bias as an interaction
between item and group in terms of analysis of variance 
procedures. Angoff and Ford (1973) considered an item 
biased if the item difficulty index is significantly higher 
or lower in one group than in another group. Scheuneman 
(1979) considered an item biased if, for all examinees 
having the same score on a test that includes that item, the 
proportion of examinees answering the item correctly is
substantially different for various subpopulations being 
considered. 

Clearly, the definition of item bias is dependent in
part upon the techniques that are used to find 
differentially performing items. For example, when using 
item response theory to investigate item bias, an item is 
considered unbiased if the item characteristic curves (ICCs)
for the item are the same for both subpopulations (Crocker & 
Algina, 1986, p. 377). In that case, among individuals with 
the same ability score, the items are equally difficult for 
members of both subpopulations. Somewhat similarly, in chi- 
square techniques, an item is considered unbiased if within 
a group of individuals with total test scores in the same 
test score interval, the proportion of individuals 
responding correctly to the item is the same for 
subpopulations (Crocker & Algina, 1986, p. 383).
Transformed item difficulty techniques (e.g., Delta Plot)
base the definition of DIF on the notion that, when items 
are ranked according to difficulty, unbiased items will be 
ordered the same in two compared groups. The assumption 
here is that bias is indicated by a significant group 
difference in the relative difficulty of an item rather than 
by a large group difference in item difficulty means and 
standard deviations (Osterlind, 1987, p. 28). A widely 
accepted definition for DIF is that an item is considered 
unbiased if examinees with equal ability, but from different 
subpopulations, have equal probability of responding 
correctly to the item (Angoff, 1982).

A variety of techniques for detecting DIF have been
proposed in the literature. Hills (1977, 1981, 1982)
identified more than 40 techniques for this purpose and
grouped them into nine general types: (1) methods for
comparing plots of transformed item difficulties; (2)
analysis of variance methods; (3) chi-square methods; (4)
foil methods, which involve examining the differential
response patterns of various groups of examinees to item
foils in order to find alternatives which overly attract or
repel a particular group; (5) correlation methods, which
involve comparison of the reliabilities of a test when the
reliability is estimated for each group separately; (6) item
response theory methods; (7) factor analysis methods; (8)
methods based on experimental manipulations; and (9)
construction methods to ensure unbiased tests.

These techniques are different but are concerned with 
similar concepts of bias. They often produce different 
results for theoretical and practical reasons. Thus,
many studies of DIF techniques in the past several years
have been devoted to comparing different techniques. The
numerous techniques proposed for detecting DIF have been
narrowed down in recent years to several of the most
promising. There exist several comprehensive reviews of the
DIF literature (Burrill, 1982; Ironson, 1982; Rudner,
Getson, & Knight, 1980; Osterlind, 1987; and Shepard,
Camilli, & Williams, 1985). The consensus from this






DIF Techniques 
4 

research is that "the ICC approach is the most generally 
valid of all biased item detection methods" (Osterlind, 
1987, p. 69). Item response theory (IRT) techniques are the
theoretically preferred procedures for detecting DIF because
they least confound real mean differences in group
performance with bias (Lord, 1977). The sample invariance
property of IRT provides a theoretical framework for how DIF
is defined and detected in a test. ICCs describe the
relationship between item difficulty and examinee ability in
terms of the probability of responding correctly. If an
item has the same meaning in two comparison groups, then the
probability of a correct response should be the same for
examinees of equal ability from different groups. Although
the IRT approach is superior theoretically, there are
practical problems in using it. For example, IRT computer
programs are costly and complicated to use. In addition, 
the three-parameter model requires a minimum of 1,000 cases 
per group (i.e., for LOGIST) to estimate item parameters, a 
requirement that often is difficult to meet in minority 
samples. As a result, other techniques that are not limited 
by difficult sample size requirements have been developed; 
for example, chi-square techniques that are considered 
approximations to item response theory techniques 
(Scheuneman, 1979; Holland & Thayer, 1986). An advantage of 
chi-square techniques is that they are easier to apply than 
IRT techniques and do not require large sample sizes. 
However, the relationship between the size of the
Mantel-Haenszel chi-square and sample size has not received
attention beyond mere mention (see, for example, Raju, Bode,
& Larsen, 1989, p. 12). DIF procedures that are clearly
recommended in the literature are IRT methods,
Mantel-Haenszel chi-square techniques, and Angoff's delta plot
method (Shepard et al., 1985, p. 84).

In this paper we describe briefly and compare three
techniques for detecting DIF: item response theory (IRT,
using the three parameter model and the Rasch model), Delta
Plot, and Mantel-Haenszel (MH) chi-square techniques. We
compare IRT and Mantel-Haenszel approaches because they are
reputed to produce similar results (see Rudner, Getson, &
Knight, 1980). We include the delta plot technique because
it has been recommended as an alternative in situations
where sample size or other practical considerations preclude
the use of IRT or chi-square methods (Subkoviak, Mack,
Ironson, & Craig, 1984). The objective of this research is
to determine the (1) effect of sample size on results from
these three DIF techniques, (2) degree of relationship
between these DIF statistics, and (3) degree to which they
identify the same items as biased.



Method and Procedures 
Data Source

The data for this study are item responses from one
form of the 1988 edition of the Maryland Test of
Citizenship Skills (MTCS). The MTCS consists of 45
multiple-choice items that assess students' knowledge and skills in
three domains: Constitutional Government; Principles,
Rights, and Responsibilities; and Politics and Political
Behavior. Annual forms of the MTCS are constructed by
sampling items from a large bank that has been calibrated 
using the Rasch model. Students must pass the MTCS, along 
with three other minimum-competency tests, in order to 
receive a Maryland high school diploma. The Maryland 
Functional Testing Program (MFTP) uses two approaches for 
detecting DIF: judgmental reviews and statistical analysis. 
Before newly written items are field tested, experts in 
ethnic and sex bias review their language and the situations 
they pose for potential sources of bias. The Delta Plot
technique is used as a post-administration check for
differentially functioning items. Flagged items are
examined for potential causes of DIF before they are
included in a student's score on the MTCS, and later 
resubmitted for review by bias specialists. 

Construction of samples. The MTCS was administered to 
approximately 50,000 9th grade students during January and 
February, 1988. The analyses are performed on 
representative samples of 1000, 750, 500, and 200 first-time
test takers. Random comparison groups (referred to as
"random 1" and "random 2") of each of the four sample sizes 
were created by randomly selecting cases from the entire 
pool. White-black and male-female comparison groups of each 
of the four sample sizes were created by randomly sampling 
cases from within race and sex strata. 

A critical assumption made for DIF techniques is that 
the test under scrutiny is unidimensional; that is, that all
items measure the same latent ability, skills, and so forth.
Investigating the unidimensionality assumption is
problematic because experts do not agree on appropriate 
methodology and criteria for testing this assumption. In 
this study, the recommendations of Reckase (1979) for 
determining unidimensionality of a test were used as 
follows: (1) In a factor analysis of test items, the first 
unrotated principal component should account for at least 20 
percent of total test variance; (2) The eigenvalue for the 
first principal component should be large relative to the 
eigenvalue for the next largest component. 
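
The two criteria above can be checked with a short calculation. The following is a minimal sketch, not the procedure used in the study: it assumes the items are scored 0/1, uses Pearson correlations between items, and extracts the two largest eigenvalues by power iteration with deflation so that only the Python standard library is needed. All function names are hypothetical, and a linear algebra library would normally replace the eigenvalue routine.

```python
import math
import random


def correlation_matrix(responses):
    """Inter-item Pearson correlations for rows of 0/1 item scores."""
    n_items = len(responses[0])
    cols = [[row[j] for row in responses] for j in range(n_items)]
    devs = []
    for col in cols:
        m = sum(col) / len(col)
        devs.append([x - m for x in col])
    norms = [math.sqrt(sum(d * d for d in dev)) for dev in devs]
    return [[sum(a * b for a, b in zip(devs[i], devs[j])) / (norms[i] * norms[j])
             for j in range(n_items)] for i in range(n_items)]


def largest_eigenpair(matrix, iterations=500, seed=0):
    """Power iteration: dominant eigenvalue and unit eigenvector of a
    symmetric positive semi-definite matrix (assumed nonzero)."""
    size = len(matrix)
    rng = random.Random(seed)
    vec = [rng.random() + 0.1 for _ in range(size)]
    value = 0.0
    for _ in range(iterations):
        prod = [sum(matrix[i][j] * vec[j] for j in range(size)) for i in range(size)]
        value = math.sqrt(sum(x * x for x in prod))
        vec = [x / value for x in prod]
    return value, vec


def unidimensionality_check(corr, min_proportion=0.20):
    """Apply the two criteria: variance share of the first component and
    the ratio of the first eigenvalue to the next largest one."""
    size = len(corr)
    first, vec = largest_eigenpair(corr)
    # Deflation removes the first component so the next one becomes dominant.
    deflated = [[corr[i][j] - first * vec[i] * vec[j] for j in range(size)]
                for i in range(size)]
    second, _ = largest_eigenpair(deflated)
    proportion = first / size  # eigenvalues of a correlation matrix sum to its size
    return {
        "first_proportion": proportion,
        "eigenvalue_ratio": first / second,
        "meets_variance_criterion": proportion >= min_proportion,
    }


# Illustrative (made-up) correlation matrix: five items, all pairs correlated 0.2.
example = [[1.0 if i == j else 0.2 for j in range(5)] for i in range(5)]
print(unidimensionality_check(example))
```

The 20 percent cutoff is taken from the text; what counts as a "large" eigenvalue ratio is left to the analyst, so the sketch reports the ratio rather than judging it.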

In the next section of this paper we describe 
procedures for detecting DIF using IRT, Delta Plot, and MH
techniques. In subsequent sections we describe results from 
implementing these three techniques in MTCS items and draw 
conclusions about the stability and agreement of the 
results. 

DIF Techniques, Methods of Analysis, and Procedures

IRT Techniques and Procedures

According to item response theory, item parameter 
estimates are invariant with regard to the group used in the 
estimation. If an item's parameter estimates are different 
for different groups, according to the theory, then the item
must be measuring more than a unidimensional ability assumed 
by the model. Therefore, item parameters that vary across 
subgroups indicate DIF. In this study we use graphical 
analysis for descriptive purposes and differences between 
ICCs to detect differentially functioning items. 

Graphical analysis. Graphical analysis involves 
plotting difficulty estimates for each item (and 
discrimination estimates in the three parameter model) for a 
focal group (i.e., blacks, females) versus a reference group 
(i.e., whites, males). Item difficulty and discrimination
estimates for comparison groups (blacks vs. whites and males 
vs. females) were plotted separately. This graphical 
analysis is recommended by Hambleton (1982) as a simple 
method to detect potentially biased items. Theoretically, 
if the item is functioning the same in both groups the 
difficulties in both groups should be identical, except for 
estimation and sampling error, and plotted points should 
tightly hug a best-fitting line. 

Differences between ICCs. We examine differences
between ICCs from the three parameter model and the Rasch
model separately. Examining three parameter ICC differences
involves six steps. First, item parameters for all items
were estimated for two random groups using the PC-BILOG
program (Mislevy & Bock, 1986). DIF results from analysis
of these parameters provide a criterion for distinguishing
real DIF, which may be caused by some form of bias against a
subgroup, from apparent DIF due to sampling error. Second,
item parameters were estimated for all items separately for
each reference and focal group. Third, item parameters for
the reference and focal groups were linearly transformed to
the same scale. Fourth, using the item parameter estimates
for the two random groups in step one as input, the difference
between the two ICCs was calculated for each item. Fifth,
the absolute difference between the ICCs for each item and
the mean, standard deviation, and 99 percent confidence
interval for these absolute differences were found.
Finally, confidence intervals were used as a baseline to
identify extreme differences in the ICCs found with the
majority and minority groups: any difference not contained
in the confidence interval was considered an indication of
DIF.
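
The baseline logic in these steps can be sketched as follows. This is an illustration under stated assumptions, not the study's program: the ICC is the usual three-parameter logistic curve with scaling constant 1.7, the difference between two ICCs is summarized as the unsigned area over abilities from -3 to 3, and the upper limit of the 99 percent interval is taken as the random-group mean plus 2.576 standard deviations. The text does not say how the interval was computed, so that rule, the function names, and the parameter values are all hypothetical.

```python
import math
from statistics import mean, stdev


def icc_3pl(theta, a, b, c, scale=1.7):
    """Three-parameter logistic ICC: probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-scale * a * (theta - b)))


def area_between_iccs(params_1, params_2, lo=-3.0, hi=3.0, steps=600):
    """Unsigned area between two ICCs (midpoint rule on [lo, hi])."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        theta = lo + (i + 0.5) * width
        total += abs(icc_3pl(theta, *params_1) - icc_3pl(theta, *params_2)) * width
    return total


def flag_items(random_pairs, group_pairs, z_crit=2.576):
    """Flag items whose reference/focal area exceeds the random-group baseline.

    Each pair holds two (a, b, c) tuples for the same item, already on a
    common scale. The cutoff rule (mean + z_crit * SD) is an assumption.
    """
    baseline = [area_between_iccs(p1, p2) for p1, p2 in random_pairs]
    upper = mean(baseline) + z_crit * stdev(baseline)
    areas = [area_between_iccs(ref, focal) for ref, focal in group_pairs]
    return upper, [i for i, area in enumerate(areas) if area > upper]


# Made-up parameter estimates for illustration only.
random_pairs = [((1.0, 0.0, 0.2), (1.0, 0.05, 0.2)),
                ((0.8, -0.5, 0.2), (0.8, -0.45, 0.2)),
                ((1.2, 0.5, 0.2), (1.2, 0.58, 0.2))]
group_pairs = [((1.0, 0.0, 0.2), (1.0, 0.04, 0.2)),
               ((1.0, 0.0, 0.2), (1.0, 0.8, 0.2))]
upper, flagged = flag_items(random_pairs, group_pairs)
print(upper, flagged)
```

Using the two random groups as the baseline means the cutoff reflects sampling error observed in the same data rather than a theoretical sampling distribution.
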

The BICAL computer program (Wright, Mead, & Bell,
1979) was used to estimate Rasch model item difficulty
parameters. Rasch model ICCs were also compared following
the six steps described above. However, since Rasch model
ICCs are parallel, the area between the ICCs for the same
item in two groups is equal to the difference in the item
difficulties (see Phillips & Mehrens, 1988).
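
The equality between area and difficulty difference can be verified numerically. The sketch below assumes the logit-metric Rasch ICC, P(theta) = 1 / (1 + exp(-(theta - b))), and integrates the gap between two such curves over a wide ability range. The difficulty values are made up for illustration.

```python
import math


def rasch_icc(theta, b):
    """Rasch ICC on the logit scale."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def area_between(b1, b2, lo=-30.0, hi=30.0, steps=60000):
    """Area between two Rasch ICCs by the midpoint rule on [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        theta = lo + (i + 0.5) * width
        total += abs(rasch_icc(theta, b1) - rasch_icc(theta, b2))
    return total * width


# Hypothetical difficulties of one item in two groups.
b_reference, b_focal = -0.4, 0.3
print(area_between(b_reference, b_focal), abs(b_reference - b_focal))
```

Because the two curves never cross, the area is simply a horizontal shift of one curve, which is why the Rasch DIF index reduces to a difference between two difficulty estimates.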

Delta Plot Technique and Procedures

The Delta Plot technique, introduced by Angoff (1972),
is based on an item-by-group interaction as a measure of
DIF. This method can produce spurious evidence of bias
unless all items are equal in discrimination or the groups
being compared do not differ in average performance. To
solve this problem Angoff (1982) proposed modifications to
correct for this source of error. We implement Angoff's
modifications in this study by performing Delta Plot
analyses on groups matched on total test score. Item
p-values were computed separately for matched white-black and
male-female groups. Item p-values were then converted to an
interval scale by determining the normal deviate z-value
associated with the p-value and transformed to delta values
with mean 13 and standard deviation 4. Delta values for each
pair of comparison groups were plotted for all items.
Paired delta values falling some critical distance from the
plot's principal axis may be regarded as contributing to
item-by-group interaction (Angoff, 1982). The perpendicular
distance of the paired delta values from the principal axis
line is the bias index. In this study, items more than +1.5
z-score units from the fixed line are considered to be
functioning differentially (see Rudner, 1977).
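
The delta transformation and the distance index can be sketched as follows. This is a minimal illustration, not the program used in the study. It assumes the conventional delta scale, delta = 13 - 4 * z(p), so that harder items receive larger deltas, and the standard major-axis formula for the principal axis. Distances are left in delta-scale units, and the 1.5 cutoff is exposed as a parameter because the text does not say how distances were scaled before the cutoff was applied. Function names and data are hypothetical.

```python
import math
from statistics import NormalDist, mean


def delta_value(p):
    """Convert an item p-value to the delta scale (mean 13, SD 4)."""
    return 13.0 - 4.0 * NormalDist().inv_cdf(p)


def principal_axis(x, y):
    """Slope and intercept of the major (principal) axis of the scatter."""
    mx, my = mean(x), mean(y)
    sxx = sum((v - mx) ** 2 for v in x)
    syy = sum((v - my) ** 2 for v in y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    slope = ((syy - sxx) + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return slope, my - slope * mx


def perpendicular_distances(x, y):
    """Signed perpendicular distance of each (x, y) point from the axis."""
    slope, intercept = principal_axis(x, y)
    scale = math.sqrt(slope ** 2 + 1.0)
    return [(slope * xi - yi + intercept) / scale for xi, yi in zip(x, y)]


def flag_items(p_group_x, p_group_y, threshold=1.5):
    """Indices of items whose distance from the axis exceeds the threshold."""
    x = [delta_value(p) for p in p_group_x]
    y = [delta_value(p) for p in p_group_y]
    return [i for i, d in enumerate(perpendicular_distances(x, y))
            if abs(d) > threshold]


# Made-up p-values for five items in two matched groups.
p_x = [0.85, 0.70, 0.55, 0.40, 0.62]
p_y = [0.80, 0.66, 0.50, 0.35, 0.30]
print(flag_items(p_x, p_y))
```

The major axis treats both groups symmetrically, unlike an ordinary regression line, which is why it is the usual reference line for delta plots.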

Mantel-Haenszel Technique and Procedures

The Mantel-Haenszel statistic (MH; see Holland and
Thayer, 1986) is based upon two-by-two contingency tables
calculated for several total test score categories.
This statistic is distributed as a chi-square with one
degree of freedom and is considered by some to be the most
powerful unbiased test of DIF (Cox, 1970). For the MH
technique, a computer program uses scored item responses
from reference and focal groups as input. The program
calculates for each item a (1) Mantel-Haenszel chi-square
statistic (see Holland & Thayer, 1986, p. 8), and (2)
difference measure called the common odds ratio across
two-by-two tables (see Holland & Thayer, 1988, p. 134). If the
MH chi-square statistic is significant, the item is
considered to be performing differentially for one of the
compared groups. In addition, if the difference measure is
greater than one, the item is performing differentially in
favor of the reference group; if it is less than one, then
it is performing differentially in favor of the focal group.
The present research uses five test score intervals in
calculating the MH indices. Rudner, Getson, and Knight
(1980) found that MH techniques using five intervals were as
effective as IRT methods under most conditions.
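
The two MH quantities can be computed from the stratified tables as sketched below. This is an illustration of the standard formulas (chi-square with a 0.5 continuity correction, and the common odds ratio), not the program used in the study. The record layout, the interval boundaries, and the function names are assumptions introduced for the example.

```python
def build_tables(records, boundaries):
    """Build one 2x2 table per score interval.

    records: iterable of (group, total_score, correct), with group "ref"
    or "focal" and correct 0/1. boundaries: ascending upper bounds of the
    score intervals (the last one must cover the maximum score).
    Each table is [A, B, C, D] = [ref right, ref wrong, focal right, focal wrong].
    """
    tables = [[0, 0, 0, 0] for _ in boundaries]
    for group, score, correct in records:
        k = next(i for i, upper in enumerate(boundaries) if score <= upper)
        col = (0 if correct else 1) + (0 if group == "ref" else 2)
        tables[k][col] += 1
    return tables


def mantel_haenszel(tables):
    """Return (MH chi-square with continuity correction, common odds ratio)."""
    sum_a = sum_expected = sum_var = num = den = 0.0
    for a, b, c, d in tables:
        total = a + b + c + d
        if total < 2:
            continue  # an interval this small carries no information
        n_ref, n_focal = a + b, c + d
        right, wrong = a + c, b + d
        sum_a += a
        sum_expected += n_ref * right / total
        sum_var += n_ref * n_focal * right * wrong / (total * total * (total - 1))
        num += a * d / total
        den += b * c / total
    chi_square = (abs(sum_a - sum_expected) - 0.5) ** 2 / sum_var
    return chi_square, num / den


# Made-up tables for one item across five score intervals.
tables = [[10, 20, 6, 24], [25, 25, 18, 32], [40, 20, 30, 30],
          [50, 10, 40, 20], [38, 2, 35, 5]]
chi_square, odds_ratio = mantel_haenszel(tables)
print(chi_square, odds_ratio)
```

With one degree of freedom the conventional .05 critical value is 3.841. The significance level used in the study is not stated in the text, so the choice of cutoff is left to the reader.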

Procedures 

The three parameter IRT procedure was implemented in 
the samples of 1000 examinees. Rasch, Delta Plot, and MH 
procedures were implemented in the samples of 200, 500, 750,
and 1000 examinees to investigate the effects of sample size 
on results. Finally, reference-focal group comparisons were 
made for blacks versus whites and males versus females. The 
different DIF techniques were evaluated in terms of 
stability, agreement, and practical and other limitations as 
described below. 

In this study we compare the IRT, Delta Plot, and
Mantel-Haenszel techniques in terms of their (1) stability
(concordance of each DIF method with itself across different
sample sizes), and (2) agreement (concordance of DIF methods
with results from the three parameter DIF approach and with
one another). We evaluate concordance by examining (1)
correlations between DIF indices, and (2) proportions of
items identified by pairs of DIF techniques as biased and
unbiased. We also evaluate these methods according to
practical and other limitations (e.g., required sample
sizes, availability of software).
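
Both concordance measures are simple to compute. The sketch below assumes that each technique yields one DIF index and one biased/unbiased flag per item, computes a Spearman rank correlation as the Pearson correlation of average ranks, and takes the proportion of agreement as the share of items given the same classification by two techniques. The function names and example values are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation between two lists of DIF indices."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def proportion_agreement(flags_a, flags_b):
    """Share of items classified the same way (biased or unbiased) by both."""
    same = sum(1 for a, b in zip(flags_a, flags_b) if a == b)
    return same / len(flags_a)


# Made-up indices and flags for six items from two techniques.
index_a = [0.10, 0.45, 0.20, 0.90, 0.05, 0.30]
index_b = [0.12, 0.40, 0.25, 0.70, 0.09, 0.50]
flags_a = [False, False, False, True, False, False]
flags_b = [False, False, False, True, False, True]
print(spearman(index_a, index_b), proportion_agreement(flags_a, flags_b))
```

The same two functions serve for stability (one technique, two sample sizes) and for agreement (two techniques, one sample size); only the inputs change.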

In this study using real test data, there is no single
criterion for identifying biased and unbiased items.
Unlike DIF studies that use simulated data, there
is no means in this study for pre-identifying biased items.
Instead, we identify biased items using the three parameter
IRT technique described above and compare results from the
other three methods to these results. Previous research
(Shepard et al., 1985) indicates that using three parameter
IRT techniques produces superior results in both real and
simulated test data.

Results 

Table 1 shows raw score means and standard deviations 
for all groups of examinees and all sample sizes. Item
p-values, point biserial correlations, and test reliabilities
are also presented for all groups and sample sizes. An 
examination of these means indicates that the white students 
scored higher on this test than black students across 
different sample sizes. However, the mean scores for male 
and female students are very similar. Internal consistency 
reliabilities are quite high and similar for the different 
groups and sample sizes. 



Table 1 about here 




The unidimensionality of the MTCS was tested by 
extracting principal components from item correlation 
matrices computed on randomly selected samples of 1000
students. The proportion of variance accounted for by the
first principal component is 19 percent for the white group,
18 percent for the black group, 20 percent for the male group,
and 19 percent for the female group. In all four
analyses the eigenvalue of the first principal component was 
at least four times as large as the next largest component. 
Thus, the eigenvalue criterion for unidimensionality 
recommended by Reckase (1979) is easily met for the MTCS in 
all comparison groups used for this study, although the 
explained variance criterion is not. 



Results from Each DIF Technique 



Three-Parameter IRT DIF Technique

Graphical analysis. All analyses using the three 
parameter model are based on samples of 1,000 students. 
Table 2 presents item difficulty, discrimination, and 
guessing parameter estimates for three pairs of comparison 
groups: random groups 1 and 2, whites and blacks, and males 
and females. Plots of item difficulties and discrimination
for these comparison groups appear in Figures 1-6.
(Reference groups always appear on the Y-axis, focal groups
on the X-axis.) Correlations of item parameters for each
pair of groups also appear in each plot. The graphical
results suggest that no items on the MTCS are racially or
sexually biased in terms of difficulty levels. In fact, the 
plots show nearly perfect linear relationships in the race 
and sex comparison group analyses. The correlation 
coefficients for discrimination estimates are not as high; 
the highest of these correlations are .85 in the random 
comparisons, .84 for black-white comparisons, and .68 for 
male-female comparisons. 



Table 2 and Figures 1-6 about here 



ICC differences. Differences between ICCs for pairs of
groups were examined by comparing the area between the ICCs
for the two independent random samples to the area between 
the ICCs for white-black and male-female samples. This 
method takes into account differences between compared 
groups in item difficulty, discrimination, and "guessing" as 
reflected in ICCs. 

The confidence intervals for the absolute differences 
between the two random groups were used as a baseline to 
identify extreme differences in the ICCs found in black- 
white and male-female samples. Any differences not 
contained in these confidence intervals were considered an 
indication of a differentially functioning item. Means, 
standard deviations, and 99 percent confidence intervals for 
differences and their absolute values are reported in the 
first three columns of Table 3. Only three out of 45 items
in the white-black comparison are outside of the confidence
interval and identified as potentially biased against
blacks. Two items were detected using this procedure as
potentially biased against females. These five items are
flagged for judgmental review by the bias committee.



Table 3 about here 



Rasch Model DIF Technique

Graphical analysis. To illustrate the Rasch DIF
results graphically, Figures 7-10 depict plots of item
difficulties for the random, race, and sex comparison groups
(N=200). Figures 11-22 depict similar plots for sample
sizes of 500, 750, and 1000. The plots show nearly perfect
linear relationships between the groups of examinees in both
sex and race analyses. Plots and correlation coefficients
identify no items to be functioning differentially for race 
or sex subgroups in terms of difficulty level. 



Figures 7-22 about here 

ICC differences. The area between the two ICCs for
each item for two independent random samples, white and
black samples, and male and female samples across four
different sample sizes was also examined. Again, a
confidence interval for the absolute differences between two
random groups was used as a baseline to identify extreme 
differences in the ICCs for the race and sex samples. 
Means, standard deviations, and 99 percent confidence 
intervals for differences and absolute differences are 
reported in columns 4-15 of Table 3. In the sample size of 
1000 four items were identified as potentially biased 
against black students. In the sample size of 750 three 
deviant items were identified, and in the sample sizes of 
500 and 200 only one item was identified as potentially 
biased against blacks (the same item). No items were
identified as potentially biased in the sex group
comparisons in the sample size of 1000. A single item was
identified as biased against females in the samples of 750,
500, and 200 examinees. 



Delta-Plot Technique 

Item delta plots for black-white and male-female 
samples matched on total score and samples of 200, 500, 750, 
and 1000 are shown in Figures 23-30. Item statistics and
DIF indices for the various comparison groups across sample 
sizes are reported in Tables 4-11 which accompany the plots. 
The last column in each table contains deviations from a 
regression line, referred to as "Bias*." The critical value 
of this index for classifying an item as biased is greater 
than +1.5 z-score units from the line. No items on the 
test appear to be racially or sexually biased in the sample 
of 200. However, in the sample of 1000 students two items
appear racially biased, and in the sample of 750 students 
four items appear racially biased. Two of those items were 
the same items found to be biased in the sample of 1000 
students. Two other items appear to be racially biased in 
the sample of 500. Regarding potential sex bias, no items 
on the test appear to be biased in the samples of 200, 500, 
750. However, in the sample of 1000 students, two items 
were identified. 



Figures 23-30 and Tables 4-11 about here 



Mantel-Haenszel Technique 

The MH technique also is performed on representative
samples of 1000, 750, 500, and 200 students. The output from
the MH analysis includes a chi-square statistic and a
difference measure. The chi-square statistic is compared to
a chi-square with one degree of freedom. The difference 
measure indicates the direction of the bias. The results in 
black-white samples show that as the sample size used in an 
analysis was decreased, a pattern developed in the chi- 
square statistics; that is they became smaller and 
identified fewer items as biased. A similar pattern was not 
observed in male-female samples. These results in black- 
white samples show a dependence between the size of the chi- 
square statistic and the sample size used in the analysis. 
For example, in the black-white sample of 200 only three
items were identified as biased, while in the sample of 
1000, 34 items were identified as biased. 
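
The arithmetic behind this dependence can be illustrated with a single two-by-two table. The sketch below uses made-up counts, not MTCS data: the same response proportions are evaluated once with 40 examinees and once with 200, a fivefold increase comparable to moving from samples of 200 to samples of 1000. The function name is hypothetical, and the formulas are the standard MH chi-square (with continuity correction) and odds ratio for one table.

```python
def mh_single_table(a, b, c, d):
    """MH chi-square (with 0.5 continuity correction) and odds ratio for
    one table: a, b = reference right/wrong; c, d = focal right/wrong."""
    total = a + b + c + d
    n_ref, n_focal = a + b, c + d
    right, wrong = a + c, b + d
    expected = n_ref * right / total
    variance = n_ref * n_focal * right * wrong / (total * total * (total - 1))
    chi_square = (abs(a - expected) - 0.5) ** 2 / variance
    odds_ratio = (a * d) / (b * c)
    return chi_square, odds_ratio


small = mh_single_table(17, 3, 13, 7)     # 40 examinees
large = mh_single_table(85, 15, 65, 35)   # same proportions, 200 examinees
print(small, large)
```

The odds ratio is identical in the two tables, while the chi-square grows roughly in proportion to the number of examinees. A fixed significance cutoff therefore flags more items as samples get larger even when the size of the group difference is unchanged, which is consistent with the pattern reported for the black-white samples.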

Comparisons Across Sample Sizes and Techniques
Stability 

Correlations in black-white samples. Table 12 contains
Spearman rank correlations of each type of DIF index with 
itself for black-white comparisons across different sample 
sizes. In general, correlations were highest between MH 
indices in samples of size 750 and 1000 (r=.74) and size 500
and 1000 (r=.72). Correlations between Rasch indices in
samples of 500 and 1000 were next highest (r=.50). In
general, stability of MH and Rasch indices in the black- 
white samples is moderate in the largest black-white 
samples. 



Table 12 about here 



Correlations in male-female samples. Table 13
contains similar correlations in male-female samples. In 
this table correlations were highest for the Rasch index in 
samples of size 750 and 1000 (r=.68), followed by 
correlations for Delta Plot indices in samples of 750 and 
1000 (r=.60). In general, stability of the Rasch and Delta 
Plot indices is moderate in the male-female samples. 

Table 13 about here



Since we also are interested in whether the same items 
were identified as biased or unbiased in different sample 
sizes, we examined proportions of items identified by each 
DIF technique in the samples of 1000 examinees versus all 
other sample sizes. The results are expressed as a 
proportion of agreement and are summarized in Tables 14 
(white-black) and 15 (male-female). According to Tables 14
and 15, proportions of agreement for the Delta Plot and
Rasch techniques are stable across sample sizes. However,
the proportions in black-white samples from the MH technique
show large variability, ranging from a low of 0.22 for the
sample size of 200 versus 1000 to a high of 0.88 for samples
of size 750 and 1000. Proportions for male-female
samples from the MH technique are stable across different 
sample sizes. 



Tables 14 and 15 about here 

Agreement

Correlations across DIF techniques. Tables 16 and 17
show correlations between DIF indices from each pair of
techniques, within each sample size in black-white and
male-female comparison groups. Correlations between Rasch and
Delta Plot indices are 0.89 (N=750) and 0.90 (N=500 and 1000)
and are larger than all other correlations in black-white
comparisons. Correlations between Rasch and Delta Plot
indices are .87 (N=500), .88 (N=1000), and .90 (N=750).
This result is of particular interest for the MTCS since its
items are calibrated within the Rasch model but checked for
DIF using the Delta Plot method.



Tables 16 and 17 about here 



The correlation between the three parameter DIF index
(the criterion for this study) and the Rasch DIF index in
black-white comparisons for N=1000 is the highest of all 
correlations with the criterion (r=0.54). The correlation 
between the three parameter and Rasch DIF index for N=1000 
in male-female comparisons is also the highest (r=.51). 

Proportions of agreement. Correlations are only a
crude measure of how well different techniques agree with 
three-parameter DIF results. We are also interested in the 
accuracy with which the DIF techniques identify biased and 
unbiased items, using items identified by the three 
parameter model as the criterion. We calculated the 
proportion of agreement between items identified by the 
three-parameter DIF index and items identified by each of 
the other three methods. The results are reported in Tables
18 and 19. Both proportions of items identified as "biased"
and "unbiased" (i.e., total hits) and proportions identified
as "biased" (true positives) are reported. In terms of
proportion of total hits, agreement with the three-parameter 
DIF index is highest for the Delta Plot and Rasch techniques 
in both black-white and male-female samples. For the MH 
technique, the proportion of total hits is high in the 
black-white comparison group (N=200). However, for the 
black-white samples of 500 or larger, agreement ranges from 
0.22 to 0.26. Low MH hit rates were not observed in the 
male-female samples. We do not discuss proportions of true 
positives because only three items were identified as 
potentially racially biased by the three parameter 
technique. 



Tables 18 and 19 about here 



Conclusion 

The graphical results indicate that no MTCS items are
functioning differentially in either black-white or
male-female comparisons. Plots of item difficulty estimates for
black-white and male-female comparisons show nearly perfect
linear relationships in both groups. The patterns of 
relationships in both race and sex plots are quite similar 
to the relationships in plots of item difficulty estimates 
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(from Rasch and three parameter calibrations) in the two 
independent random samples. Patterns are also similar in 
samples of different sizes. 

Differences between ICCs from the three-parameter DIF 
technique identified only three items in the race comparison 
samples and two items in the sex comparison samples that 
appear to be functioning differentially. Since confidence 
intervals around absolute mean differences were used to 
identify these items, there is a small probability of 
erroneously detecting items as biased. Because of the 
potential for false positive errors, and because it can be 
instructive to identify the features of items that may cause 
them to function differently in different groups, these 
items are resubmitted for further judgmental review but are 
not excluded from test scores. 

Stability in black-white samples, as indicated by rank 
order correlations of the same DIF indices in samples of 
different sizes, is low to moderate for the Delta Plot and 
Rasch DIF techniques. MH stability is high in comparisons 
of large samples. Stability in male-female samples, as 
indicated by rank order correlations of the same DIF 
indices, is moderate for the Rasch, Delta Plot, and MH 
techniques in large samples. 

Proportions of agreement for the Delta Plot and Rasch 
techniques are stable across sample sizes in black-white and 
male-female samples. However, agreement proportions from 
the MH technique in black-white samples show large 
variability (i.e., are not stable). Proportions of agreement 
from the MH technique for male-female samples are stable 
across sample sizes. 

Agreement, as indicated by rank order correlations 
across DIF techniques, is very high between Rasch and Delta 
Plot DIF indices for all sample sizes in both black-white 
and male-female comparisons. This result is of particular 
interest since MTCS items are calibrated using the Rasch 
model but checked for DIF using the Delta Plot method. The 
Rasch index agrees moderately with the three-parameter DIF 
index; agreement of other techniques with the 
three-parameter DIF index is low. 
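
The rank order correlations used here as the measure of agreement (and of stability) can be sketched as follows. This is a minimal illustration, assuming Spearman's coefficient computed as the Pearson correlation of average ranks; the source does not state how ties were handled, and the function names and the example index values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank order correlation between two sets of DIF indices."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical DIF indices for six items from two techniques.
index_a = [0.30, 0.44, -0.07, 0.25, 0.47, -0.31]
index_b = [0.21, 0.52, -0.10, 0.18, 0.40, -0.25]
print(round(spearman(index_a, index_b), 3))
```

Because only the ordering of items enters the coefficient, two techniques can correlate highly even when their indices are on different scales, which is one reason correlations alone are a crude measure of agreement.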

In terms of agreement regarding biased and unbiased 
items, agreement with the three-parameter DIF index is 
highest for the Delta Plot and Rasch techniques in both 
black-white and male-female samples. For the MH technique, 
the proportion of total hits is high in the black-white 
comparison group of sample size 200. However, for 
black-white samples of 500 or larger, agreement is low. These 
findings regarding stability and agreement in real test data 
partly support previously published research. Harris and 
Kolen (1986), Harris and Hoover (1986), and Skaggs and 
Lissitz (1988) have reported that different DIF techniques 
do not agree very well with each other and are only 
moderately stable across different sample sizes. However, 
these studies have relied mostly on correlations between DIF 
indices to indicate agreement (cf. Skaggs & Lissitz, 1988); 
in this study we have reported both correlations and 
proportions of agreement. 

A pattern in chi-square statistics is evident in 
results from the MH analyses of black-white samples in these 
data. As the number of response patterns was decreased in 
black-white samples, chi-square statistics became smaller 
and identified fewer items as biased. This pattern suggests 
a dependence between the size of chi-square statistics and 
the sample sizes used in the MH analyses. 
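
The dependence of the MH chi-square on sample size can be illustrated with a short Python sketch. It assumes the usual continuity-corrected Mantel-Haenszel statistic computed over 2 x 2 tables (group by right/wrong) at each matched score level; the source does not give the exact formula used, and the function name and the counts below are hypothetical.

```python
def mh_chi_square(strata):
    """Continuity-corrected Mantel-Haenszel chi-square (1 df).

    strata: list of (a, b, c, d) counts, one tuple per matched score level:
        a = reference group correct, b = reference group incorrect,
        c = focal group correct,     d = focal group incorrect.
    """
    sum_a = sum_expected = sum_variance = 0.0
    for a, b, c, d in strata:
        total = a + b + c + d
        if total < 2:
            continue  # a stratum this small carries no information
        n_ref, n_foc = a + b, c + d
        n_right, n_wrong = a + c, b + d
        sum_a += a
        sum_expected += n_ref * n_right / total
        sum_variance += (n_ref * n_foc * n_right * n_wrong
                         / (total * total * (total - 1)))
    if sum_variance == 0:
        raise ValueError("no stratum with variation in both margins")
    return (abs(sum_a - sum_expected) - 0.5) ** 2 / sum_variance


# Hypothetical counts: same proportions correct, five times as many examinees.
small = [(30, 20, 25, 25), (40, 10, 35, 15)]
large = [tuple(5 * n for n in stratum) for stratum in small]
print(round(mh_chi_square(small), 2), round(mh_chi_square(large), 2))
```

The group difference in proportion correct is identical in the two data sets, yet the statistic for the larger one is several times greater, so a fixed critical value flags more items as samples grow.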

We are aware that with the large examinee samples 
available in statewide testing, chi-square significance 
tests for item by group interactions using traditional alpha 
levels may be sensitive to item functioning differences 
which have no practical importance. In the data used for 
this study, few non-significant item by group interactions 
were found in large samples by the MH technique. Proper 
adjustment of significance levels, so that only practically 
significant degrees of bias are flagged, requires 
consideration of effect size and power. However, no 
attempt was made to adjust the significance levels in 
this study. 

It should be mentioned that in previously published 
research item parameters were estimated using the LOGIST 
program, while in the present study PC-BILOG was used for 
three-parameter item estimates. BILOG implements marginal 
maximum likelihood estimation procedures, which produce more 
stable estimates across subgroups than other estimation 
procedures (see Baghi, 1988). 

Table 1 



Descriptive Statistics for All Student Subsamples 

Samples        N     Mean    Median    SD      P     rpbis   KR-20 

Random 1      200    32.87   34.00    7.49    0.73   0.39    0.84 
Random 2      200    33.49   35.00    8.34    0.74   0.42    0.85 
Random 3      500    32.97   34.00    8.21    0.73   0.42    0.85 
Random 4      500    33.49   36.00    8.58    0.74   0.40    0.86 
Random 5      750    33.49   36.00    8.35    0.74   0.45    0.85 
Random 6      750    33.59   36.00    8.47    0.75   0.39    0.85 
Random 7     1000    33.09   35.00    8.53    0.74   0.42    0.86 
Random 8     1000    33.28   35.00    8.48    0.74   0.45    0.86 

White 1       200    35.84   38.00    7.43    0.80   0.39    0.82 
White 2       500    35.23   37.00    7.81    0.78   0.42    0.83 
White 3       750    35.05   37.00    7.18    0.78   0.46    0.82 
White 4      1000    35.12   37.00    7.49    0.78   0.41    0.83 

Black 1       200    30.91   32.00    8.61    0.69   0.44    0.87 
Black 2       500    30.79   31.00    8.04    0.68   0.45    0.85 
Black 3       750    31.07   32.00    8.45    0.69   0.42    0.86 
Black 4      1000    31.04   32.00    8.18    0.69   0.39    0.86 

Male 1        200    33.51   36.00    9.06    0.74   0.43    0.86 
Male 2        500    33.26   35.00    8.82    0.74   0.41    0.86 
Male 3        750    34.14   36.00    8.23    0.76   0.44    0.84 
Male 4       1000    33.75   36.00    8.54    0.75   0.44    0.85 

Female 1      200    33.04   33.00    7.63    0.73   0.40    0.84 
Female 2      500    33.97   35.00    7.86    0.75   0.42    0.84 
Female 3      750    33.69   35.00    7.85    0.75   0.45    0.84 
Female 4     1000    33.71   35.00    7.84    0.75   0.41    0.84 

Table 2 

Item Parameter Estimates for Selected Student Subsamples (N=1000) 

Columns: Item #, followed by a, b, and c for each of the 
Random 1, White, Black, Male, and Female samples. 

[Item-level parameter estimates are not legible in the source.] 

Note. a = discrimination; b = difficulty level; c = guessing. 



Table 3 

[Comparisons of the subsamples Random 1 vs. Random 2, White vs. 
Black, and Male vs. Female. The caption is only partly legible 
and the table values are not legible in the source.] 

FIGURE 1 Plot of Item Difficulty Estimates from the Three-Parameter 
Model in Two Random Samples. (N=1000) 

[Scatterplot of item difficulty estimates, axes labeled RAN1 and RAN2; r=0.969] 

FIGURE 2 Plot of Item Discrimination Estimates from the Three-Parameter 
Model in Two Random Samples. (N=1000) 

FIGURE 3 Plot of Item Difficulty Estimates from the Three-Parameter 
Model in Black and White Samples. (N=1000) 

FIGURE 4 Plot of Item Discrimination Estimates from the Three-Parameter 
Model in Black and White Samples. (N=1000) 

in Two Random Samples. (N=200) 

FIGURE 13 Plot of Item Difficulty Estimates from the Rasch Model 
in Two Random Samples. (N=500) 

FIGUREJ2 Plot of Item Difficulty Estimates from the Rasch Model 
in Two Random Samples. (N=750) 

FIGURE 19 Plot of Item Difficulty Estimates from the Rasch Model 
in Two Random Samples. (N=1000) 

FIGURE 20 Plot of Item Difficulty Estimates from the Rasch Model 
in Black and White Samples. (N=1000) 

FIGURE 21 Plot of Item Difficulty Estimates from the Rasch Model 
in Two Random Samples. (N=1000) 

Item Statistics and DIF Indices for the Delta 
Plot Method: White VS Black. (N = 200) 

FIGURE 23 Plot of Delta Values for White 
and Black Samples. (N=200) 

Item Statistics and DIF Indices for the Delta 
Plot Method: White VS Black. (N = 500) 

FIGURE 24 Plot of Delta Values for White 
and Black Samples. (N=500) 

TABLE 6 

Item Statistics and DIF Indices for the Delta 

.00 
.88 


.81 


1 .23 


.71 


.95 


.55 


.60 


.81 


.25 


.43 


.05 


• . 18 


.49 


.36 


- 03 


.77 


1.08 


. 74 


.78 


1.41 


. 77 


.86 


1.17 


1 .08 


.95 


1.88 


1.65 


.79 


1.28 


.81 


.70 


.95 


.52 


.87 


1.55 


1.13 


.90 


1.88 


:.28 


.80 


1.08 


.84 


.73 


1.08 


.61 


.90 


1 .65 


1.28 


.76 


.95 


.71 


.88 


1.48 


1.17 


.58 


.47 


.20 


.67 


.81 


.44 


.75 


1.13 


.67 


.60 


.50 


.25 


.72 


.99 


.58 


.6b 


.81 


.47 


.60 


.55 


.25 


.57 


.44 


.18 


.69 


.92 


.50 


.54 


.41 


. 10 


.63 


.52 


.33 


.80 


.99 


.84 



20.00 
13.60 
18.12 
14.12 
13.80 
16.90 
20. JO 
14.00 
14.76 
18.52 

16.08 

15.44 

13.80 

13.80 

17.92 

16.80 

16.24 

13.20 

14.44 

17.32 

18.64 

17.68 

20.52 

18.12 

16.80 

19.20 

20.52 

17.32 

17.32 

19.60 

16.80 

18.92 

14.88 

16.24 

17.52 

15.00 

16.96 

16.24 

15.20 

14.76 

16.58 

14.64 

15.08 

13.96 



18.36 

13.20 

16.52 

13.40 

13. 12 

15.00 

17.92 

12.88 

13.20 

15.68 

12.58 

15.^4 

14.12 

12.88 

13.00 

16.52 

15.20 

14.00 

12.28 

12.88 

15.96 

16.08 

17.32 

19.60 

16.24 

15.08 

17.52 

18.12 

16.36 

15.44 

18.1? 

15.84 

17.68 

13.80 

14.76 

15.68 

14.00 

15.32 

14.88 

14.00 

13.72 

15.00 

13.40 

14.32 

16.36 



BIAS 



.05 
.42 
- .09 
.72 
U2 
-.35 
-.28 
-.09 
-.35 
.34 
.34 
.75 
-.11 
.04 
.13 
.05 
-.20 
-.73 
-.01 
-.38 
.02 
-.78 
.81 
.64 
.30 
.29 
.05 
.47 
.32 
.37 
. 14 
.28 
.26 
.02 
. 16 
.32 
ng 

.22 
.07 
.04 
.04 
.27 
. 12 
.28 
.56 



FIGURE 25 Plot of Delta Values for White 
and Black Samples. (N=750) 

[Plot of delta values, reference (REF) vs. focal (FOC) group] 



TABLE 7 

Item Statistics and DIF Indices for the Delta 
Plot Method: White VS Black. (N = 1000) 

NUMBER    P-VALUE        Z-VALUE         DELTA         BIAS* 
          REF   FOC     REF    FOC     REF     FOC 

   1      .95   .92    1.65   1.41    19.60   18.64     .30 
   2      .57   .52     .18    .05    13.72   13.20     .44 
   3      .89   .81    1.23    .88    17.92   16.52    -.07 
   4      .61   .53     .28    .08    14.12   13.32     .25 
   5      .56   .51     .15    .03    13.60   13.12     .47 
   6      .82   .69     .92    .50    16.68   15.00    -.31 
   7      .95   .90    1.65   1.28    19.60   18.12    -.07 
   8      .59   .48     .23   -.05    13.92   12.80     .01 
   9      .67   .53     .44    .08    14.76   13.32    -.19 
  10      .80   .74     .84    .64    16.36   15.56     .32 
  11      .54   .46     .10   -.10    13.40   12.60     .23 
  12      .79   .76     .81    .71    16.24   15.84     .60 
  13      .73   .61     .61    .28    15.44   14.12    -.08 
  14      .60   .42     .25   -.20    14.00   12.20    -.47 
  15      .58   .50     .20    .00    13.80   13.00     .24 
  16      .89   .78    1.23    .77    17.92   16.08    -.38 
  17      .84   .71     .99    .55    16.96   15.20    -.36 
  18      .78   .63     .77    .33    16.08   14.32    -.38 
  19      .54   .40     .10   -.25    13.40   12.00    -.20 
  20      .63   .47     .33   -.08    14.32   12.68    -.35 
  21      .84   .78     .99    .77    16.96   16.08     .28 
  22      .91   .81    1.34    .88    18.36   16.52    -.37 
  23      .89   .87    1.23   1.13    17.92   17.52     .66 
  24      .98   .95    2.05   1.65    21.20   19.60    -.11 
  25      .90   .78    1.28    .77    18.12   16.08    -.52 
  26      .83   .69     .95    .50    16.80   15.00    -.39 
  27      .94   .88    1.55   1.17    19.20   17.68    -.11 
  28      .96   .92    1.75   1.41    20.00   18.64     .03 
  29      .88   .79    1.17    .81    17.68   16.24     .10 
  30      .86   .75    1.08    .67    17.32   15.68     .26 
  31      .94   .91    1.55   1.34    19.20   18.36     .38 
  32      .83   .75     .95    .67    16.80   15.68     .10 
  33      .94   .89    1.55   1.23    19.20   17.92     .06 
  34      .66   .58     .41    .20    14.64   13.80     .24 
  35      .78   .67     .77    .44    16.08   14.76     .06 
  36      .85   .77    1.04    .74    17.16   15.96     .06 
  37      .71   .62     .55    .30    15.20   14.20     .14 
  38      .84   .74     .99    .64    16.96   15.56     .10 
  39      .79   .69     .81    .50    16.24   15.00     .00 
  40      .73   .58     .61    .20    15.44   13.80     .32 
  41      .70   .59     .52    .23    15.08   13.92     .02 
  42      .81   .71     .88    .55    16.52   15.20     .05 
  43      .67   .56     .44    .15    14.76   13.60     .01 
  44      .70   .61     .52    .28    15.08   14.12     .16 
  45      .85   .79    1.04    .81    17.16   16.24     .26 

FIGURE 26 Plot of Delta Values for White 
and Black Samples. (N=1000) 

[Plot of delta values, reference (REF) vs. focal (FOC) group] 



TABLE 8 



Item Statistics and DIF Indices for the Delta 
Plot Method: Male VS Female. (N = 200) 

FIGURE 27 Plot of Delta Values for Male 
and Female Samples. (N=200) 



NUMBER 



1 

2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 
23 
24 
25 
26 
27 
28 
29 
30 
31 
32 
33 
34 
35 
36 
37 
38 
39 
40 
41 
42 
43 
44 
45 



P-VALUE 
REF FOC 



Z-VALUE 
REF FOC 



DELTA 
REF FOC 



.92 

.60 

.84 

.58 

.54 

.79 

.94 

.60 

.66 

.77 

.51 

.77 

.65 

.56 

.60 

.83 

.81 

.Sn 

.55 

.57 

.80 

.85 

.85 

.96 

.84 

.77 

.87 

.90 

.82 

.82 

.91 

.80 

.88 

.61 

.74 

.81 

.69 

.80 

.72 

.69 

.65 

.78 

.63 

.70 

.83 



.93 
.51 
.87 
.58 
.53 
.75 
.91 
.53 
.54 
.76 
.50 
.79 
.66 
.53 
.54 
.87 
.77 
.78 
.42 
.57 
.80 
.89 
.91 
1.00 
.84 
.72 
.95 
.96 
.88 
.80 
.93 
.79 
.90 
.57 
.71 
.80 
.61 
.76 
.73 
.60 
.59 
.77 
.60 
.73 
.84 



1.41 
.25 
.99 
.20 
.10 
.81 
1.55 
.25 
.41 
.74 
.03 
.74 
.39 
.15 
.25 
.95 
.88 
.50 
.13 
.18 
.84 
1 .04 
1 .04 
1.75 
.99 
.74 
1.13 
1.28 
.92 
.92 
1.34 
.84 
1.17 
.28 
.64 
.88 
.50 
.84 
.58 
.50 
.39 
.77 
.33 
.52 
.95 



1 .48 
.03 

1.13 
.20 
.08 
.67 

1.34 
.08 
.10 
.71 
.00 
.81 
.41 
.08 
.10 

1.13 
.74 
.77 

-.20 
.18 
.84 

1 .23 

1.34 

3.00 
.99 
.58 

1.65 

1.75 

1.17 
. 84 

1.48 
.81 

1.28 
.18 
.55 
.84 
.28 
.71 
.61 
.25 
.23 
.74 
.25 
.61 
.99 



18.64 

14.00 

16.96 

13.80 

13.40 

16.24 

19.20 

14.00 

14.64 

15.96 

13.12 

15.96 

14.56 

13.60 

14.00 

16.80 

16.52 

15.00 

13.52 

13.72 

16.36 

17.16 

17.16 

20.00 

16.96 

15.96 

17.52 

18.12 

16.68 

16.68 

18.36 

16.36 

17.68 

14.12 

15.56 

16.52 

15.00 

16.36 

15.32 

15.00 

14.56 

18.08 

14.32 

15.08 

16.80 



18.92 

13.12 

17.52 

13.80 

13.32 

15.68 

18.36 

13.32 

13.40 

15.84 

13.00 

16.24 

14.64 

13.32 

13.^0 

17.52 

15.96 

16.08 

12.20 

13.7? 

16.36 

17.9. 

18.36 

25.00 

16.96 

15.32 

19.60 

20.00 

17.68 

16.36 

18.92 

16.24 

18.12 

13.72 

15.20 

16.36 

14.12 

15.84 

15.44 

14.00 

13.92 

15.96 

14.00 

15.44 

16.96 



BIAS* 



•1. 



66 
03 
04 
52 
58 
46 
44 
09 
-.40 
-.14 
.64 
.Olf 
.35 
.42 
.13 
.09 
-.54 
.79 
-.14 
.54 
-.19 
.02 
.26 
1.59 
-.35 
-.43 
.65 
.37 
.28 
-.45 
-.43 
-.25 
-.31 
.21 
-.17 
-.32 
-.30 
-.48 
.17 
-.37 
-.05 
-.18 
.20 
.37 
-.22 



[Plot of delta values, reference (REF) vs. focal (FOC) group] 



TABLE 9 



Item Statistics and DIF Indices for the Delta 
Plot Method: Male VS Female. (N = 500) 

FIGURE 28 Plot of Delta Values for Male 
and Female Samples. (N=500) 



NUMBER 



1 

2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 
23 
24 
25 
26 
27 
28 
29 
30 
31 
32 
33 
34 
35 
36 
37 
38 
39 
40 
41 
42 
43 
44 
45 



P-VALUE 
REF FOC 



Z-VALUE 
REF 



DELTA 



.94 

.59 

.83 

.59 

.55 

.77 

.94 

.57 

.64 

.75 

.50 

.72 

.67 

.54 

.b6 

.84 

.76 

.69 

.47 

.58 

.80 

.82 

.83 

.94 

.84 

.78 

.88 

.91 

.79 

.77 

.91 

.78 

.90 

.62 

.71 

.77 

.65 

.77 

.71 

.62 

.61 

.79 

.63 

.66 

.80 



.92 

.51 

.84 

.58 

.51 

. 73 

.90 

.53 

.54 

.76 

.52 

.76 

.66 

.52 

.52 

.79 

.78 

.73 

.49 

.58 

.82 

.83 

.86 

.93 

.83 

. 72 

.92 

.91 

.82 

.78 

.91 

.78 

.88 

.61 

.74 

.80 

.71 

.76 

.77 

.68 

.64 

.73 

.62 

.65 

.84 



1.55 
.23 
.95 
.23 
. 13 
.74 

1.55 
.18 
.36 
.67 
.00 
.58 
.44 
.10 
.15 
.99 
.71 
.50 
-.08 
.20 
.84 
.92 
.95 
1.55 
.99 
.77 
1.17 
1.34 
.81 

.74 
1.34 

.77 
1.28 

.30 

.55 

.74 

.39 

.74 

.55 

.30 

.28 

.81 

.33 

.41 

.84 





FOC 


RFF 


fnc 


1.41 


A 9 , CM 


1 fi RA 




.03 


AO , 9C 


11 19 




.99 


1 R tn 


10.90 




.20 


11 09 


13.80 




.03 


11 

A<3 . ^ C 


11 19 




.61 


1 OK. 
13.90 


1 C MA 

19.44 


1.28 




10 11 

1 0 . 1 c 




n ■ 

. U0 


11 7 9 


11 11 




1 n 


14 . •?4 


1 1 An 
13.40 




7 1 
. / I 


1 eft 

1 9 . Do 


1 C ^ M 




nc 

. 


1 1 nn 


13.20 




. 7 1 


15 . 3£ 


15.84 




. 41 




t A CM 

14.64 




. 05 


1 1 An 


11 in 
1 3 . 2 U 




. 05 


11 (tn 


11 in 




. ol 




1 R 'J A 

10 . c 4 




77 


15.84 


1 ft n ft 

1 0 . U O 




.61 


15 . 00 


1 >^ A A 




.03 


1 ? ' fiR 


19 11 R 

aC . Q 0 




.20 


13 . 80 


11 fin 

1 J . 0 u 




.92 


16 ! 36 


1 ft ft ft 




.95 


16.68 


Av . 9\J 


1 


.08 


16 ! 80 


17 17 


1 


.48 


19 . 20 


18 Q9 

A 0 . it C 




.95 


16 96 


1 ft sn 




.58 


16!08 


15.32 


1 


.41 


17.68 


18.64 


1 


.34 


13.36 


18.36 




.92 


16.24 


16.68 




.77 


15.96 


16.08 


1 


.34 


18.36 


18.36 




.77 


16.08 


16.08 


1 


17 


18.12 


17.68 




28 


14.20 


14.12 




64 


15.20 


15.56 




84 


15.96 


16.36 




55 


14.56 


15.20 




71 


15.96 


15.84 




74 


15.20 


15.96 




47 


14.20 


14.88 




36 


14.12 


14.44 




61 


16.24 


15.44 




30 


14.32 


14.20 




39 


14.64 


14.56 




99 


16.36 


16.96 



BIAS* 



-.40 
-.54 
.12 
-.06 
-.25 
-.35 
-.77 
-.25 
-.71 
.13 
.17 
.38 
-.06 
-.11 
-.25 
-.50 
.18 
.33 
.18 
.03 
.24 
.09 
.38 
.20 
.10 
.52 
.68 
.00 
.32 
. 10 
.00 
.01 
.31 
.03 
.27 
.30 
.47 
.07 
.55 
.50 
.25 
.55 
.06 
.03 
.43 



[Plot of delta values, reference (REF) vs. focal (FOC) group] 



TABLE 10 

Item Statistics and DIF Indices for the Delta 
Plot Method: Male VS Female. (N = 750) 

NUMBER    P-VALUE        Z-VALUE         DELTA         BIAS* 
          REF   FOC     REF    FOC     REF     FOC 

   1      .93   .89    1.48   1.23    18.92   17.92    -.60 
   2      .59   .52     .23    .05    13.92   13.20    -.23 
   3      .84   .84     .99    .99    16.96   16.96     .16 
   4      .59   .59     .23    .23    13.92   13.92     .26 
   5      .56   .53     .15    .08    13.60   13.32     .08 
   6      .78   .72     .77    .58    16.08   15.32    -.34 
   7      .91   .88    1.34   1.17    18.36   17.68    -.36 
   8      .55   .47     .13   -.08    13.52   12.68    -.30 
   9      .67   .56     .44    .15    14.76   13.60    -.57 
  10      .77   .76     .74    .71    15.96   15.84     .11 
  11      .50   .52     .00    .05    13.00   13.20     .43 
  12      .76   .74     .71    .64    15.84   15.56     .00 
  13      .71   .67     .55    .44    15.20   14.76    -.09 
  14      .53   .49     .08   -.03    13.32   12.88    -.02 
  15      .56   .49     .15   -.03    13.60   12.88    -.22 
  16      .83   .80     .95    .84    16.80   16.36    -.14 
  17      .78   .77     .77    .74    16.08   15.96     .10 
  18      .69   .74     .50    .64    15.00   15.56     .61 
  19      .49   .44    -.03   -.15    12.88   12.40    -.03 
  20      .58   .50     .20    .00    13.80   13.00    -.28 
  21      .80   .80     .84    .84    16.36   16.36     .18 
  22      .83   .84     .95    .99    16.80   16.96     .27 
  23      .84   .86     .99   1.08    16.96   17.32     .40 
  24      .92   .93    1.41   1.48    18.64   18.92     .29 
  25      .84   .81     .99    .88    16.96   16.52    -.15 
  26      .78   .70     .77    .52    16.08   15.08    -.50 
  27      .87   .89    1.13   1.23    17.52   17.92     .41 
  28      .90   .89    1.28   1.23    18.12   17.92    -.02 
  29      .82   .81     .92    .88    16.68   16.52     .05 
  30      .80   .78     .84    .77    16.36   16.08    -.02 
  31      .89   .87    1.23   1.13    17.92   17.52    -.15 
  32      .79   .77     .81    .74    16.24   15.96    -.01 
  33      .88   .85    1.17   1.04    17.68   17.16    -.23 
  34      .62   .58     .30    .20    14.20   13.80    -.02 
  35      .73   .73     .61    .61    15.44   15.44     .21 
  36      .81   .78     .88    .77    16.52   16.08    -.13 
  37      .67   .64     .44    .36    14.76   14.44     .01 
  38      .78   .75     .77    .67    16.08   15.68    -.09 
  39      .73   .73     .61    .61    15.44   15.44     .21 
  40      .65   .67     .39    .44    14.56   14.76     .38 
  41      .65   .64     .39    .36    14.56   14.44     .16 
  42      .78   .73     .77    .61    16.08   15.44     .25 
  43      .63   .60     .33    .25    14.32   14.00     .03 
  44      .64   .65     .36    .39    14.44   14.56     .33 
  45      .80   .79     .84    .81    16.36   16.24     .09 

FIGURE 29 Plot of Delta Values for Male 
and Female Samples. (N=750) 

[Plot of delta values, reference (REF) vs. focal (FOC) group] 

TABLE 11 



Item Statistics and DIF Indices for the Delta 
Plot Method: Male VS Female. (N = 1000) 

FIGURE 30 Plot of Delta Values for Male 
and Female Samples. (N=1000) 



NUMBER 



1 

2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 
23 
24 
25 
26 

27 

28 

29 

30 

31 

32 

33 

34 

35 

36 

37 

38 

39 

40 

41 

42 

43 

44 

45 



P-VALUE 
REF FOC 



Z-VALUE 
REF FOC 



DELTA 
REF FOC 



.92 

.59 

.84 

.58 

.55 

.76 

.92 

.5/ 

.65 

.78 

.50 

.75 

.68 

.54 

.58 

.83 

.79 

.70 

.49 

.58 

.79 

.82 

.84 

.92 

.85 

.78 

.88 

.91 

.82 

.78 

.90 

.79 

.88 

.62 

.70 

.79 

.68 

.78 

.73 

.64 

.61 

.76 

.60 

.64 

.80 



.89 

.51 

.85 

.57 

.54 

.74 

.89 

.53 

.55 

.76 

.50 

.75 

.65 

.52 

.49 

.80 

.76 

.74 

.46 

.53 

.79 

.84 

.87 

.93 

.82 

.74 

.90 

.91 

.81 

.78 

.88 

.77 

.87 

.58 

.76 

.76 

.65 

.75 

.74 

.64 

.63 

.73 

.61 

.67 

.81 



1.41 
.23 
.99 
.20 
.13 
.71 

1.41 
.18 
.39 
.77 
.00 
.67 
.47 
.10 
.20 
.95 
.81 
.52 

-.03 
.20 
.81 
.92 
.99 

1.41 
1 .04 
.77 
1M7 
1.34 

.92 

.77 
1.28 

.81 
1.17 

.30 

.52 

.81 

.47 

.77 

.61 

.36 

.28 

.71 

.25 

.36 

.84 



1.2:^ 

.03 

1 04 
.18 
.10 
.64 

1.23 
.08 
.13 
.71 
.00 
.67 
.39 
.05 
-.03 
.84 
.71 
.64 
-.10 
.08 
.81 

.99 
1.13 
1.48 

.92 

.64 
1.28 
1.34 

.88 

.77 
1.17 

.74 
1.13 

.20 

.71 

.71 

.39 

.67 

.64 

.36 

.33 

.61 

.28 

.44 



18.64 

13.92 

16.96 

13.80 

13.52 

15.84 

18.64 

13.72 

14.56 

16.08 

13.00 

15.68 

14.88 

13.40 

13.80 

16.80 

16.24 

15.08 

12.88 

13.80 

16.24 

16.68 

16.96 

18.64 

17.16 

16.08 

17.68 

18.36 

16.68 

16.08 

18.12 

16.24 

17.68 

14.20 

15.08 

16.24 

14.88 

16.08 

15.44 

14.44 

14.12 

15.84 

14.00 

14.44 

16.36 



17.92 

13.12 

17.16 

13. 72 

13.40 

15.56 

17.92 

13.32 

13.52 

15.84 

13.00 

15.68 

14.56 

13.20 

12.88 

16.36 

15.84 

15.56 

12.60 

13.32 

16.24 

16.96 

17.52 

18.92 

16.68 

15.56 

18.12 

18.36 

16.52 

16.08 

17.68 

15.96 

17.52 

13.80 

15.84 

15.84 

14.56 

15.68 

15.56 

14.44 

14.32 

15.44 

14.12 

14.76 

16.52 



BIAS 



-.48 

-.38 
.21 
.12 
.10 

-.08 

-.48 

-.10 

-.57 

-.06 
.20 
.11 

-.08 
.05 

-.46 

-.22 

-.18 
.47 
.01 

-.16 
.10 
.28 
.46 
.21 

' 26 

-.26 
.36 
.03 
.03 
.10 
.27 
.10 
.06 

.12 

.66 

.18 

.08 

.17 

.20 

.15 

.30 

. 17 
25 

. J7 

.20 



[Plot of delta values, reference (REF) vs. focal (FOC) group] 

TABLE 12 



Stability of DIF Statistics Across Independent Samples (Black vs White) 

                   N=200     N=500         N=750     N=1000 

N=200    Delta               [illegible]   0.137     0.159 
         Rasch               0.148         0.158     0.242 
         M-H                 0.303         0.352     0.311 

N=500    Delta                             0.416     0.412 
         Rasch                             0.427     0.498 
         M-H                               0.468     0.719 

N=750    Delta                                       0.354 
         Rasch                                       0.486 
         M-H                                         0.738 

Note. Stability indicated by rank order correlations. 

TABLE 13 

Stability of DIF Statistics Across Independent Samples (Male vs Female) 

                   N=200     N=500         N=750     N=1000 

N=200    Delta               0.112         0.116     0.104 
         Rasch               0.369         0.481     0.309 
         M-H                 0.313         0.358     0.401 

N=500    Delta                             0.425     0.317 
         Rasch                             0.335     0.468 
         M-H                               0.244     0.345 

N=750    Delta                                       [illegible] 
         Rasch                                       0.602 
         M-H                                         0.518 

Note. Stability indicated by rank order correlations. 

TABLE 14 





750 


vs 1000 


500 


vs 1000 


200 vs 


1000 








PI 


P2 


PI 


P2 


PI 


P2 






Delta 


0.950 


0.04 (2) 


0.950 


0.02 (1) 


1.000 


0.0 


(0) 




Rasch 


0.930 


0.04 (2) 


0.930 


0.02 (1) 


0.930 


0.02 


(1) 




M-H 


0.840 


0.75 (34) 


0.880 


0.73 (33) 


0.220 


0.06 


(3) 





Note: Pl=proportion of total hits; P2-r>roportion of true positives; 
Numbers m parenthesis indicate the number of items. 



TABLE 15

Stability of DIF Statistics Across Independent Samples (Male vs Female)

            750 vs 1000          500 vs 1000          200 vs 1000
            P1      P2           P1      P2           P1      P2

Delta       1.000   0.00 (0)     1.000   0.00 (0)     1.000   0.00 (0)
Rasch       0.970   0.00 (0)     0.970   0.00 (0)     0.930   0.02 (1)
M-H         0.840   0.04 (2)     0.800   0.02 (1)     1.000   0.00 (0)

Note: P1 = proportion of total hits; P2 = proportion of true positives.
Numbers in parentheses indicate the number of items.
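
The P1 and P2 entries in Tables 14 and 15 can be sketched as follows. The sketch rests on an assumed reading of the table notes: a "hit" is an item classified the same way (flagged or not flagged) in both samples, and a "true positive" is an item flagged in both. The function name and the flagged item numbers are hypothetical.

```python
from typing import Sequence, Tuple


def classification_agreement(
    flags_a: Sequence[bool], flags_b: Sequence[bool]
) -> Tuple[float, float, int]:
    """Return (P1, P2, number of items flagged in both samples).

    P1: proportion of items classified identically in both samples (assumed).
    P2: proportion of items flagged as DIF in both samples (assumed).
    """
    if len(flags_a) != len(flags_b) or len(flags_a) == 0:
        raise ValueError("need two non-empty flag lists of equal length")
    n_items = len(flags_a)
    both_flagged = sum(1 for a, b in zip(flags_a, flags_b) if a and b)
    both_clear = sum(1 for a, b in zip(flags_a, flags_b) if not a and not b)
    p1 = (both_flagged + both_clear) / n_items
    p2 = both_flagged / n_items
    return p1, p2, both_flagged


# Hypothetical example on a 45-item test: items flagged in each sample.
N_ITEMS = 45
flagged_small = {3, 17, 30}   # made-up item numbers, smaller sample
flagged_large = {3, 17, 41}   # made-up item numbers, N=1000 sample
small = [item in flagged_small for item in range(1, N_ITEMS + 1)]
large = [item in flagged_large for item in range(1, N_ITEMS + 1)]

p1, p2, n_both = classification_agreement(small, large)
print(f"P1 = {p1:.3f}, P2 = {p2:.2f} ({n_both})")
```

Under this reading, a P2 entry such as 0.04 (2) corresponds to 2 of the 45 items being flagged in both samples.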
TABLE 16

Agreement of DIF Statistics Across Techniques (Black vs White)

                    Rasch          M-H            3 Parameter

Delta    N=200      0.206          0.006          0.012
         N=500      0.906          0.195          0.086
         N=750      [illegible]    [illegible]    0.113
         N=1000     [illegible]    0.015          0.476

Rasch    N=200                     0.086          0.126
         N=500                     0.108          0.350
         N=750                     0.094          0.410
         N=1000                    0.033          0.535

M-H      N=200                                    [illegible]
         N=500                                    [illegible]
         N=750                                    0.367
         N=1000                                   0.236

Note: Agreement is indicated by rank order correlations.

TABLE 17

Agreement of DIF Statistics Across Techniques (Male vs Female)

                    Rasch          M-H            3 Parameter

Delta    N=200      0.072          0.119          0.265
         N=500      0.867          0.184          0.212
         N=750      0.901          0.213          0.208
         N=1000     0.880          0.062          0.310

Rasch    N=200                     0.065          0.212
         N=500                     0.136          0.310
         N=750                     0.051          0.371
         N=1000                    0.218          0.510

M-H      N=200                                    0.033
         N=500                                    0.269
         N=750                                    0.244
         N=1000                                   0.2?8

Note: Agreement is indicated by rank order correlations.
TABLE 18

Agreement of Three DIF Techniques with the Three-Parameter Model
(Black vs. White)

          N = 1000            N = 750             N = 500             N = 200
          P1      P2          P1      P2          P1      P2          P1      P2

Delta     0.930   0.02 (1)    0.880   0.02 (1)    0.930   0.02 (1)    0.930   0.000
Rasch     0.890   0.02 (1)    0.910   0.02 (1)    0.910   0.00 (0)    0.910   0.000
M-H       0.220   0.06 (3)    0.240   0.04 (2)    0.260   0.04 (2)    0.870   0.000

Note. P1 = proportion of total hits; P2 = proportion of true positives.
Numbers in parentheses indicate the number of items.






TABLE 19

Agreement Across Three DIF Techniques with the Three-Parameter Model
(Male vs. Female)

          N = 1000            N = 750             N = 500             N = 200
          P1      P2          P1      P2          P1      P2          P1      P2

Delta     0.970   0.02 (1)    0.960   0.00 (0)    0.960   0.00 (0)    0.960   0.000
Rasch     0.970   0.00 (0)    0.950   0.00 (0)    0.950   0.00 (0)    0.950   0.000
M-H       0.880   0.04 (2)    0.950   0.02 (1)    0.880   0.00 (0)    0.950   0.000

Note. P1 = proportion of total hits; P2 = proportion of true positives.
Numbers in parentheses indicate the number of items.